Generation of a privilege graph to represent data access authorizations

ABSTRACT

The technology disclosed herein enables generation of a privilege graph to represent data access authorizations. In a particular embodiment, a method includes extracting identity information for a plurality of users from a plurality of identity environments and privilege information from a plurality of data environments. The method further includes forming subgraphs for the identity environments and the data environments from the identity information and the privilege information. The method also includes translating the subgraphs into a canonical schema and, after translating the subgraphs, combining the subgraphs into the privilege graph.

RELATED APPLICATIONS

This application is related to and claims priority to U.S. Provisional Patent Application 63/073,751, titled “GENERATION OF A PRIVILEGE GRAPH TO REPRESENT DATA ACCESS AUTHORIZATIONS,” filed Sep. 2, 2020, and which is hereby incorporated by reference in its entirety.

BACKGROUND

Modern enterprises use numerous data environments to store, manage, and/or process data. Those environments may be managed by different systems, applications, and/or platforms from different providers, and each may use its own data repository (e.g., database). For instance, different departments may employ different database systems depending on the features offered by the respective system (e.g., accounting may use a first database system while human resources uses a second). In some cases, a single department may itself use multiple platforms for data repositories depending on the capabilities of each platform even if the platforms manage similar data sets. For example, human resources may use one platform to onboard and terminate employees from the enterprise while another platform is used to handle employees' compensation and benefits. The repositories may be hosted local to the enterprise (i.e., at one or more of the enterprise's own facilities) or may be cloud based and hosted by third parties. Likewise, the cardinality of the data environments and the data therein can be very high (on the order of thousands of individual elements, such as data tables, that a user can potentially access), which makes it very difficult (if not impossible) for a human administrator to track which data can be accessed by which users.

SUMMARY

The technology disclosed herein enables generation of a privilege graph to represent data access authorizations. In a particular embodiment, a method includes extracting identity information for a plurality of users from a plurality of identity environments and privilege information from a plurality of data environments. The method further includes forming subgraphs for the identity environments and the data environments from the identity information and the privilege information. The method also includes translating the subgraphs into a canonical schema and, after translating the subgraphs, combining the subgraphs into the privilege graph.

In some embodiments, the method includes displaying the privilege graph to an administrator authorized to view the privilege graph.

In some embodiments, forming the subgraphs includes creating a user node for a user of the plurality of users and sequentially connecting the user node to one or more attribute nodes that each represent an attribute of the user indicated in the identity information. In those embodiments, upon reaching a last attribute node of the one or more attribute nodes, the method may include connecting the last attribute node to a privileges node and connecting the privileges node to one or more nodes of authorized data environments of the plurality of data environments that the user is authorized to access. The one or more nodes of authorized data environments each may represent data or a feature that the user is authorized to access.

In some embodiments, translating the subgraphs includes, for attribute nodes of the subgraphs, changing attribute labels representing attributes of a user to canonical labels defined by the canonical schema.

In some embodiments, combining the subgraphs includes, for an attribute represented by attribute nodes in multiple subgraphs, generating a common attribute node and migrating connections with the attribute nodes to the common attribute node. In those embodiments, after migrating the connections, the method may include identifying replicated connections with the common attribute node and deduplicating the replicated connections.

In some embodiments, the method includes identifying a change to the privilege information and updating the privilege graph based on the change. Updating the privilege graph may include adding or removing a connection between nodes in the privilege graph.

In another embodiment, an apparatus is provided having one or more computer readable storage media and a processing system operatively coupled with the one or more computer readable storage media. Program instructions stored on the one or more computer readable storage media, when read and executed by the processing system, direct the processing system to extract identity information for a plurality of users from a plurality of identity environments and privilege information from a plurality of data environments. The program instructions further direct the processing system to form subgraphs for the identity environments and the data environments from the identity information and the privilege information. Also, the program instructions direct the processing system to translate the subgraphs into a canonical schema and, after translating the subgraphs, combine the subgraphs into a privilege graph.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an implementation for generating a privilege graph representing data access authorizations.

FIG. 2 illustrates an operation for generating a privilege graph representing data access authorizations.

FIG. 3 illustrates a privilege graph generated to represent data access authorizations.

FIG. 4 illustrates a privilege graph generated to represent data access authorizations.

FIG. 5 illustrates subgraphs of a privilege graph generated to represent data access authorizations.

FIG. 6 illustrates an operation for generating a privilege graph representing data access authorizations.

FIG. 7 illustrates an operation for generating a privilege graph representing data access authorizations.

FIG. 8 illustrates a subgraph of a privilege graph generated to represent data access authorizations.

FIG. 9 illustrates an operation for generating a privilege graph representing data access authorizations.

FIG. 10 illustrates a computing architecture generating a privilege graph representing data access authorizations.

DETAILED DESCRIPTION

Modern enterprises use numerous data environments to store, manage, and/or process data. Those environments may be managed by different systems, applications, and/or platforms from different providers, and each may use its own data repository (e.g., database). For instance, different departments may employ different database systems depending on the features offered by the respective system (e.g., accounting may use a first database system while human resources uses a second). In some cases, a single department may itself use multiple platforms for data repositories depending on the capabilities of each platform even if the platforms manage similar data sets. For example, human resources may use one platform to onboard and terminate employees from the enterprise while another platform is used to handle employees' compensation and benefits. The repositories may be hosted local to the enterprise (i.e., at one or more of the enterprise's own facilities) or may be cloud based and hosted by third parties. Likewise, the cardinality of the data environments and the data therein can be very high (on the order of thousands of individual elements, such as data tables, that a user can potentially access), which makes it very difficult (if not impossible) for a human administrator to track which data can be accessed by which users.

Each of the environments discussed above uses its own mechanisms to regulate which users have access to which features and which data. That is, the mechanisms regulate the privileges that each user has for accessing each data environment and prevent users who are not authorized to access certain features or data from doing so. As such, each environment needs to receive information defining the privileges for each user that is authorized to access at least a portion of the features/data available therefrom. To track privileges across a multitude of data environments, the graphing service described herein uses a privilege graph to track users and corresponding privileges. The privilege graph, when displayed to a user, graphically represents the associations between authorizations, users, groups, etc. within an entity (e.g., the user's employer), which enables the user to easily comprehend the nature of authorizations for the entity.

FIG. 1 illustrates implementation 100 for privilege graph-based representation of data access authorizations. Implementation 100 includes graphing service 101, data environments 102, user terminal 103, and identity environments 104. Graphing service 101 and data environments 102 communicate over respective communication links 111. Graphing service 101 and user terminal 103 communicate over communication link 112. Graphing service 101 and identity environments 104 communicate over respective communication links 113. While communication links 111-113 are shown as direct links, communication links 111-113 may include intervening systems, networks, and/or devices. Graphing service 101 executes on one or more computing systems, such as server systems, having processing and communication circuitry to operate as described below. User terminal 103 is a user operated computing system, such as a desktop workstation, laptop, tablet computer, smartphone, etc., that user 141 uses to access data environments 102.

In operation, graphing service 101 generates privilege graph 131, which tracks authorizations defined in identity environments 104 and corresponding ones of data environments 102. Identity environments 104 include one or more systems that maintain information about users (e.g., user identity information, user attributes, etc.) and information about which of data environments 102 (including specific data/features therein) each user is allowed to access. Identity environments 104 may include an active directory (AD) server, a privilege account management (PAM) system, human resources management system (HRMS), identity and access governance (IAG) system, cloud-based identity access management system, or any other type of system that maintains the user information discussed above. By tracking the authorization of many, if not all, users in an organization (e.g., business enterprise), privilege graph 131 is able to not only represent authorizations for particular users but also represent authorizations based on attributes of users (e.g., the user's role and/or group). Privilege graph 131 may be stored local to graphing service 101 or may be accessible to graphing service 101 from an external data repository, which may itself be managed by one of data environments 102. Graphing service 101 performs operation 200, described below, to generate privilege graph 131 from information obtained from data environments 102 and identity environments 104.

FIG. 2 illustrates operation 200 for privilege graph-based representation of data access authorizations. In operation 200, graphing service 101 extracts identity information (active directory information, cloud-based identity access management information, PAM information, etc.) for a plurality of users from identity environments 104 and privilege information from data environments 102 (201). To access data environments 102 and identity environments 104, graphing service 101 may use credentials that indicate graphing service 101 is allowed to access the identity information and privilege information on identity environments 104 and data environments 102, respectively. The credentials may be provided to graphing service 101 by an administrator(s) of data environments 102 and identity environments 104. In some examples, the credentials may provide only read-only access to graphing service 101 so as to ensure graphing service 101 cannot modify the identity information or the privilege information.

The users may be all users in an organization (e.g., a business enterprise) or may be a subset thereof. The users may include human users, such as user 141, that access data environments 102 and/or identity environments 104 through their respective user terminals, such as user terminal 103. The users may also include non-human users, such as one or more computing systems, applications, micro-services, services, and/or other types of non-human component that could access one or more of data environments 102 with proper privileges. Users may be identified in the identity information using an identifier for the particular user (e.g., name of the user, username of the user, employee identifier, etc.), including machine IDs, app IDs, etc. for non-human users. In some examples, the identity of a non-human user (e.g., service or application) may be tied to a human user in charge of the non-human user (e.g., the human user who “owns” the service/application) and the privileges of one may, therefore, be synonymous with (or dependent upon) the privileges of the other. The identity information may also include attributes for the respective users. The attributes may indicate a work group for a user, a job title/role for a user, a seniority of the user, a security clearance level for the user, or any other type of attribute that may affect what data environments the user can access. The privilege information indicates which respective users are allowed to access which ones of data environments 102. In some examples, the privilege information may further indicate specific features and/or data within each of data environments 102 that respective users are allowed to access. For example, one of data environments 102 may be a data repository that includes multiple data tables therein and a user may only be allowed to access a subset of those data tables.

Graphing service 101 forms subgraphs for the identity environments and the data environments from the identity information and the privilege information (202). Each subgraph corresponds to a distinct set of identity and privilege information. For example, one identity environment of identity environments 104 may indicate one set of attributes for users while another identity environment indicates another set of attributes. A subgraph would be created by graphing service 101 for each respective set. In particular, an example subgraph of one of the sets would indicate attributes (e.g., groups) in the identity environment's identity information as nodes with branches connecting to users in each group. Likewise, each of data environments 102 indicates attributes, such as roles, with access privileges thereto. A subgraph of each data environment includes a node for the data environment, or nodes for specific features/data of the data environment, with branches therefrom to roles having access thereto.
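By way of illustration only, the subgraph formation of step 202 might be sketched in Python as follows. The function names, the tuple-based node representation, and the sample group and grant data are assumptions made for this sketch and are not part of the disclosure.

```python
def identity_subgraph(group_members):
    """Build edges from each group node to the user nodes in that group."""
    return {
        (("group", group), ("user", user))
        for group, users in group_members.items()
        for user in users
    }


def data_subgraph(feature_roles):
    """Build edges from each feature/data node to the roles allowed to access it."""
    return {
        (("feature", feature), ("role", role))
        for feature, roles in feature_roles.items()
        for role in roles
    }


# Hypothetical extracted information, one set per environment.
directory_groups = {"marketing": ["alice", "bob"], "sales": ["bob"]}
database_grants = {"orders_table": ["analyst"], "reports_view": ["analyst", "manager"]}

identity_edges = identity_subgraph(directory_groups)
data_edges = data_subgraph(database_grants)
```

Each environment yields its own edge set, so the subgraphs remain separate until they are translated and combined.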

The identity information and privilege information from the respective data environments 102 and identity environments 104 may use different schemas that represent information differently. For example, one system may use one terminology to indicate privileges (e.g., “column read”) while another system uses a different terminology (e.g., “read column”). In another example, one environment may use one title for a particular role while another environment may use a different title for the same role. To account for those different schemas, graphing service 101 translates the subgraphs into a canonical schema (203). The canonical schema may be one of the schemas used by data environments 102 and/or identity environments 104 or may be a schema that is unique to graphing service 101. After translating the subgraphs, information that is the same across multiple subgraphs will be represented in the same manner (e.g., users will be identified using the same name/identifier, roles will have the same name/identifier, etc.).
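As a minimal sketch of the translation of step 203, and assuming the canonical schema can be expressed as a simple label-to-label mapping, the relabeling might look like the following. The mapping contents and the function name are hypothetical.

```python
# Hypothetical mapping from environment-specific labels to canonical labels.
CANONICAL_LABELS = {
    "read column": "column read",
    "team leader": "supervisor",
}


def translate_subgraph(edges, mapping=CANONICAL_LABELS):
    """Return the edges with every node label replaced by its canonical form."""

    def canon(label):
        # Labels that are already canonical (or unknown) pass through unchanged.
        return mapping.get(label, label)

    return {(canon(src), canon(dst)) for src, dst in edges}


subgraph = {("alice", "team leader"), ("team leader", "read column")}
translated = translate_subgraph(subgraph)
```

After translation, nodes that describe the same role or privilege carry identical labels and can be matched directly when subgraphs are combined.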

After the translating, graphing service 101 combines the subgraphs into privilege graph 131 (204). Since the subgraphs are now all using the canonical schema, the information represented by those subgraphs can be combined into privilege graph 131. For example, multiple subgraphs may include a node for a particular role but at least one of those subgraphs differs from the others regarding which users branch from that role. Instead of having multiple nodes for that same role, privilege graph 131 includes only one node for that role and a branching node for each user in the subgraphs regardless of how many times that user appeared across the subgraphs. Other subgraphs may indicate ones of data environments 102 to which the users in that role have access. The role node would then also branch to those ones of data environments 102. In some examples, not every user in the role will have access to the same data environments 102. In those cases, graphing service 101 may determine other attributes common to the users that share each respective set of data environments. Rather than branching from the role directly to the environment sets, the role node would branch to the two or more other determined attributes before then branching to the ones of data environments 102 that each user having the other attributes is allowed to access. In some examples, specific data/features of the data environments 102 may be indicated as being accessible rather than data environments as a whole.

To combine the subgraphs into privilege graph 131, for each data environment 102, graphing service 101 may include definitions of the authentication mechanism being used for users to access each of data environments 102. Using the authentication mechanism, graphing service 101 retrieves the corresponding authentication entity (including both name and properties). An authentication entity is anything that is provided with authorization to access ones of data environments 102, such as a group of users (e.g., marketing group, sales group, etc.) or individual users themselves. Graphing service 101 uses the authentication entity to query identity environments 104 to get the corresponding entity of the identity environments and graphing service 101 creates an edge between the entity in the data environment and the entity in the identity environment. For example, a database in data environments 102 could have a database role that can be connected to the Active Directory (AD) group in an AD server of identity environments 104.
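The edge creation described above might be sketched as follows, under the assumption that a query to an identity environment can be stood in for by a membership test against a set of known entity names. All names and sample values here are hypothetical.

```python
def link_entities(data_entities, identity_entities):
    """Create an edge for each data-environment entity with a matching identity entity.

    data_entities: mapping of data-environment entity name -> authentication entity name
    identity_entities: set of entity names known to the identity environment
    """
    edges = []
    for data_entity, auth_entity in data_entities.items():
        # The membership test stands in for querying the identity environment.
        if auth_entity in identity_entities:
            edges.append((("data", data_entity), ("identity", auth_entity)))
    return edges


# Hypothetical database roles and the AD groups they authenticate against.
database_roles = {"db_role_reporting": "AD-Reporting", "db_role_legacy": "AD-Retired"}
ad_groups = {"AD-Reporting", "AD-Finance"}

cross_edges = link_entities(database_roles, ad_groups)
```

In this sketch, a database role whose authentication entity is not found in the identity environment simply receives no edge.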

When the subgraphs are combined, user 141 may operate user terminal 103 to view privilege graph 131 from graphing service 101. Privilege graph 131 allows user 141 to visualize all the users having access to particular ones of data environments 102 and what attributes those users have. If privilege graph 131 is organized spatially with users on the left and data environments 102 on the right, user 141 can traverse privilege graph 131 from a selected user to the ones of data environments 102 the selected user can access through nodes representing attributes of the selected user. Privilege graph 131 may also be used for purposes other than visualizing user access privileges. For example, an automated privilege assignment system may use privilege graph 131 to determine which of data environments 102 a new user should be allowed to access.
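The left-to-right traversal described above can also be performed programmatically. The following sketch assumes the privilege graph is held as a list of directed (source, destination) label pairs and uses a breadth-first search; the function name and the sample graph are illustrative assumptions.

```python
from collections import defaultdict, deque


def accessible_environments(edges, user, environments):
    """Return the data environments reachable from a user through attribute nodes."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)

    seen = {user}
    queue = deque([user])
    found = set()
    while queue:
        node = queue.popleft()
        if node in environments:
            found.add(node)
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return found


# Hypothetical graph: users on the left, data environments on the right.
graph_edges = [
    ("alice", "group A"), ("group A", "role X"),
    ("role X", "privileges P"), ("privileges P", "database 1"),
    ("bob", "role Y"), ("role Y", "privileges Q"),
    ("privileges Q", "application 2"),
]
all_environments = {"database 1", "application 2"}

alice_access = accessible_environments(graph_edges, "alice", all_environments)
```

An automated privilege assignment system of the kind mentioned above could run the same query for an existing user with matching attributes.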

FIG. 3 illustrates privilege graph 300 for privilege graph-based representation of data access authorizations. Privilege graph 300 is an example of privilege graph 131. Data environments 301 are examples of data environments 102. Data environments 301, in this example, include databases, such as Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) databases, files, applications, and computing resources. Nodes 302 are at a level in the privilege graph that points to particular features 311 of data environments 301 that are accessible to users having attributes that led to respective ones of nodes 302 during traversal of privilege graph 300. Nodes 303 are nodes at a level prior to reaching nodes 302 and represent different roles that a user may have. Similarly, nodes 304 are at a level prior to reaching nodes 303 and represent different groups in which a user may be included. The level before nodes 304 is a level with nodes 305, which represent the users themselves. When a user in nodes 305 has a particular attribute (e.g., is in a particular group), a branch from the node 305 for that user is displayed to a node of nodes 304 representing that attribute. From that node 304, branches are displayed to nodes of nodes 303 that represent other attributes (e.g., roles) that users in the node 304 have. From one of the nodes 303 to which one of those branches terminated, branches are displayed to nodes of nodes 302 that represent other attributes (e.g., privileges) that the users in the node 303 have. As can be seen on privilege graph 300, the branches from nodes 305 may direct to any one of nodes 302-304 because different types of users may not have certain attributes (e.g., may not belong to groups or have a role). Likewise, a user, like the IAM principal node of nodes 305, may branch to different levels of nodes.

Privilege graph 300 may be presented to a user by graphing service 101. For example, user 141 may be an administrator that has a need or desire to view the overall landscape of data environment authorizations represented by privilege graph 300. User 141 may operate user terminal 103 to request privilege graph 300 from graphing service 101 via a graphical user interface (GUI) to graphing service 101 (e.g., a web-based application or native application). User terminal 103 displays privilege graph 300 to user 141 through the GUI. Being able to view privilege graph 300, rather than privilege graph 300 simply being represented in memory, allows user 141 to more easily view which attributes of users lead to those users having access to particular ones of data environments 102 (and features/data therein), which are represented as the databases, files, folders, applications, compute elements, online transaction processing (OLTP), and online analytical processing (OLAP) elements on the right side of privilege graph 300.

When viewing privilege graph 300, user 141 may notice that users having certain attribute(s) or combinations of attribute(s) are currently authorized to access a particular data environment to which they should not have access. User 141 may then instruct graphing service 101 to deauthorize those users from accessing the particular data environment or user 141 may use user terminal 103 to deauthorize the users from accessing the data environment. In either situation, graphing service 101 will update privilege graph 300 after the users are deauthorized to reflect the fact that the users are not authorized to access the data environment. In some cases, privilege graph 300 may track how privileges change over time. Thus, in the above example, user 141 may be able to “look back in time” to see that the users were once able to access the data environment that they were deauthorized from accessing.

Additionally, privilege graph 300 may be only one level of detail that user 141 is able to view with respect to the privileges depicted thereby. The GUI for graphing service 101 may further allow user 141 to specify what information user 141 wishes to view. For example, while privilege graph 300 shows which user attributes result in authorization to which data environments, user 141 may desire to see which specific users are allowed to access a particular data environment. Upon specifying that desire to graphing service 101, privilege graph 300 may change in the GUI to show specific users as nodes branching from a node representing the particular data environment. In an alternative example, user 141 may specify that they desire to view which attributes of users allow those users to access a particular data environment and nodes representing those attributes may then be displayed branching from the data environment. Of course, other authorization relationships may be presented using privilege graph 300 as well.

Even further uses of privilege graph 300 are envisioned, including data-based dynamic role assignments (e.g., assigning a role to a user based on the data that the role can access), risk scores (e.g., assigning a score representing how at risk certain data is for being accessed by an unwanted user), tagging, least privilege violations, anomaly detection (e.g., identifying users, roles, etc. that should not have access to certain data even though they currently are authorized to do so), monitoring, recommendations, audit reporting, etc.

FIG. 4 illustrates privilege graph 400 for privilege graph-based representation of data access authorizations. Privilege graph 400 is another example of privilege graph 131. In this example, user 141 is an administrator to whom privilege graph 400 presents a high-level overview of which users have access to which of the data environments, including data systems 451, applications 452, and computing resources 453. By tracing through the connections between nodes from left to right, user 141 can see which attribute combinations (i.e., groups, roles, etc.) are currently being allowed to access which data and/or features of the data environments. Graphing service 101 displays privilege graph 400, in this example, through a display of user terminal 103, which may execute an application for interacting with graphing service 101 or access graphing service 101 through a web-based interface.

In this example, the users whose access privileges are represented by privilege graph 400 are employees 401 and applications 402, although other types of users may be included in other examples. Employees 401 and applications 402 may represent the entirety of users under the purview of user 141 or may be only a subset (e.g., user 141 may be responsible for all users in an enterprise or just a subset thereof). When looking at privilege graph 400, user 141 can determine, based on the connections between user nodes and group nodes, that one or more of employees 401 are in groups 411-413 and one or more of applications 402 are in groups 413-414. In some cases, an individual user may belong to more than one of the groups. As user 141 continues to move to the right through privilege graph 400, user 141 follows the connections between nodes for groups 411-414 and roles 421-426 to determine which of groups 411-414 have users with which of roles 421-426. For example, group 412 has connections to role 421, role 422, and role 423. Those connections indicate to user 141 that group 412 has users in each of those roles. In some cases, one user may be in more than one of the roles.

Continuing right from nodes for roles 421-426, user 141 follows connections to the nodes of privileges 431-434 to determine which of roles 421-426 have users with which of privileges 431-434. For instance, there are connections from role 422, role 423, and role 426 to privileges 432. As such, one or more users in each of those roles have privileges 432. The node for privileges 432 then connects to show what access is granted by privileges 432. In this case, privileges 432 only have one connection to feature 444 of applications 452. Other privileges enable access to multiple ones of features/data 441-446 (e.g., privileges 431 enable access to data 442, data 443, and feature 445). By viewing privilege graph 400 as a whole, user 141 may be able to recognize a connection between nodes that should or should not be in privilege graph 400 and make changes accordingly. Had the users, attributes, and privileges not been displayed in this manner, user 141 may never have recognized the deficiency represented by the connection.

FIG. 5 illustrates subgraphs 500 of a privilege graph generated to represent data access authorizations. Subgraphs 500 include subgraph 501 and subgraph 502, which are subgraphs that are created and combined to form part of privilege graph 400, as described below. Subgraph 501 is a subgraph tracing privileges of a single employee 521 who has yet to be included in employees 401. Subgraph 502 is a subgraph tracing privileges of employees already included in employees 401. Subgraph 502 may itself be a subgraph created from a combination of two or more subgraphs. Subgraphs 500 may be displayed to user 141 or may remain in memory of graphing service 101 while graphing service 101 creates subgraphs 500, combines subgraphs 500, and, eventually, produces the complete privilege graph 400, as discussed below. In some examples, subgraphs 500 may be displayed upon request of user 141 should user 141 want to view a different level of detail within privilege graph 400.

FIG. 6 illustrates operation 600 for generating a privilege graph representing data access authorizations. Operation 600 describes the creation of an initial subgraph, subgraph 501 in this case, for combination with other subgraphs to create privilege graph 400. In operation 600, graphing service 101 uses the identity information and the privilege information to identify employee 521 and to determine attributes of employee 521. The attributes of employee 521 indicate that employee 521 is in group 411, has role 423, and has privileges 432, which allow employee 521 to access feature 444. Graphing service 101 creates subgraph 501 from the attributes identified for employee 521. In particular, graphing service 101 creates a node representing employee 521, as shown in subgraph 501 (601). Graphing service 101 further creates a node for each of group 411, role 423, privileges 432, and feature 444 (602). The created nodes are then connected in sequence until employee 521 is connected to feature 444 through attributes 411, 423, and 432 to form subgraph 501 (603). A hierarchy of attributes may be predefined so that graphing service 101 orders the attributes in a desired manner. In this example, from left to right in subgraph 501, a group attribute is included before a role attribute. The hierarchy may be defined based on the number of nodes that are possible in a particular level. For instance, there may be fewer groups than there are roles, so roles are designated to come after groups in subgraph 501. Once subgraph 501 is completed, should subgraph 501 be displayed to user 141, user 141 can easily trace the connections from employee 521 through employee 521's attributes to feature 444 that employee 521 can access.
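Steps 601-603 might be sketched as follows, assuming the predefined hierarchy is expressed as a rank per node kind. The rank values, the function name, and the tuple representation of nodes are assumptions of this sketch.

```python
# Assumed hierarchy: a lower rank places the node further left in the subgraph.
HIERARCHY = {"user": 0, "group": 1, "role": 2, "privileges": 3, "feature": 4}


def build_chain_subgraph(nodes):
    """Order (kind, label) nodes by the hierarchy and connect them in sequence."""
    ordered = sorted(nodes, key=lambda node: HIERARCHY[node[0]])
    # Each node is connected to the next node in the ordered sequence.
    edges = list(zip(ordered, ordered[1:]))
    return ordered, edges


# Nodes for employee 521, deliberately supplied out of order.
nodes_501 = [
    ("role", "role 423"),
    ("feature", "feature 444"),
    ("user", "employee 521"),
    ("privileges", "privileges 432"),
    ("group", "group 411"),
]

ordered_501, edges_501 = build_chain_subgraph(nodes_501)
```

Because the ordering comes from the hierarchy rather than from the order in which attributes were extracted, every subgraph built this way places groups before roles.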

Either before or after creation of subgraph 501, graphing service 101 may translate the labels of the nodes in subgraph 501 to comply with a canonical schema. For example, role 423 may have a label that is different than the label used for role 423 in privilege graph 400. Thus, role 423 will be relabeled in subgraph 501. For instance, role 423 may indicate that employee 521 holds a “team leader” role but the canonical schema refers to the team leader role as being a supervisor role. The label of “team leader” would, therefore, be changed to “supervisor” so that graphing service 101 can find corresponding supervisor nodes when combining subgraphs, as described in more detail below.

FIG. 7 illustrates operation 700 for generating a privilege graph representing data access authorizations. Operation 700 is an example of how subgraph 501 and subgraph 502 are combined into subgraph 800. In operation 700, graphing service 101 determines that both subgraph 501 and subgraph 502 include role 423, privileges 432, and feature 444 (701). Graphing service 101 then identifies the nodes for role 423, privileges 432, and feature 444 in subgraph 502 as being the common nodes for combination (702). In this example, the nodes from subgraph 502 are selected because subgraph 502 is a more complex subgraph (e.g., has more nodes/connections) than subgraph 501. In other examples, the nodes of subgraph 502 may be selected for some other reason, including at random/arbitrarily. The connections between the nodes for role 423, privileges 432, and feature 444 in subgraph 501 are then migrated to maintain the same connections between the common nodes in subgraph 502 (703). For example, a connection between the node for group 411 and the node for role 423 exists in subgraph 501. That connection is migrated to run between the node for group 411 and the node for role 423 in subgraph 502. During the migration, some connections will be replicated. For example, there is a connection between the node for role 423 and the node for privileges 432 in both subgraph 501 and subgraph 502. Graphing service 101 deduplicates those replicated connections (704). In some examples, graphing service 101 deduplicates the connections by removing replicated connections after the migration. In other examples, graphing service 101 may check to see whether a connection already exists between two nodes and, if so, refrains from migrating that connection.
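The second deduplication approach described above, in which a connection is not migrated if it already exists, might be sketched as follows. The sketch assumes nodes are identified by their canonical labels, so that identically labeled nodes in two subgraphs are treated as the common node. The contents shown for subgraph 502 are a simplified, hypothetical subset.

```python
def merge_subgraphs(target_edges, source_edges):
    """Migrate source connections into the target, skipping ones already present."""
    merged = list(target_edges)
    seen = set(target_edges)
    skipped = []
    for edge in source_edges:
        if edge in seen:
            # The connection already exists between the common nodes.
            skipped.append(edge)
            continue
        seen.add(edge)
        merged.append(edge)
    return merged, skipped


subgraph_501 = [
    ("employee 521", "group 411"),
    ("group 411", "role 423"),
    ("role 423", "privileges 432"),
    ("privileges 432", "feature 444"),
]
# Hypothetical, simplified contents for subgraph 502.
subgraph_502 = [
    ("employees 401", "group 412"),
    ("group 412", "role 423"),
    ("role 423", "privileges 432"),
    ("privileges 432", "feature 444"),
]

merged_edges, skipped_edges = merge_subgraphs(subgraph_502, subgraph_501)
```

Checking before migrating avoids a separate cleanup pass, at the cost of one set lookup per connection.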

FIG. 8 illustrates subgraph 800 of a privilege graph generated to represent data access authorizations. Subgraph 800 is a resulting subgraph after operation 700 has been performed on subgraphs 500. The node for employee 521 is still shown separately from the node for employees 401. In some examples, operation 700 may further determine that a node can be included in another node. Since employees 401 represent employees of an enterprise and employee 521 is an employee of that enterprise, graphing service 101 may include employee 521 in employees 401. In such a case, the connection between the node for employee 521 and the node for group 411 will be migrated to connect between the node for employees 401 and the node for group 411, as is the case in privilege graph 400.
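Including one node in another can be sketched as rewiring the member node's connections to the containing node. The `absorb_node` function and the dictionary layout below are hypothetical. How graphing service 101 decides that employee 521 belongs to employees 401 is not modeled here, so the caller is assumed to supply that decision.

```python
def absorb_node(graph, member, container):
    """Fold `member` into `container`, migrating the member's connections."""
    nodes = {nid: label for nid, label in graph["nodes"].items() if nid != member}
    edges = set()
    for src, dst in graph["edges"]:
        src = container if src == member else src
        dst = container if dst == member else dst
        if src != dst:  # drop any connection that would become a self-loop
            edges.add((src, dst))
    return {"nodes": nodes, "edges": edges}


subgraph_800 = {
    "nodes": {"employees_401": "employees", "employee_521": "employee",
              "group_411": "group"},
    "edges": {("employee_521", "group_411"), ("employees_401", "group_411")},
}

folded = absorb_node(subgraph_800, "employee_521", "employees_401")
print(sorted(folded["edges"]))  # prints [('employees_401', 'group_411')]
```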

FIG. 9 illustrates operation 900 for generating a privilege graph representing data access authorizations. Operation 900 is an example of how privilege graph 400 may be modified when changes are made to the identity information and privilege information. In operation 900, graphing service 101 identifies a change to the identity and privilege information that indicates no user in role 422 has privileges 433 any longer (901). In response to identifying the change, graphing service 101 removes the connection in privilege graph 400 between the node for role 422 and the node for privileges 433 (902). After removal, when privilege graph 400 is displayed that connection will no longer be in the display. In some examples, if a node no longer has a connection, graphing service 101 may remove the node from privilege graph 400 (903). For example, if the connection from the node for role 425 to privileges 433 is removed, there are no privileges remaining for role 425 and the role's node can be removed.
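Steps 902 and 903 can be sketched as an edge removal followed by an optional pruning pass. The `remove_connection` function, the `prune` flag, and the "group_x" node are hypothetical and serve only to illustrate the behavior under an assumed dictionary layout.

```python
def remove_connection(graph, src, dst, prune=True):
    """Remove one connection and, optionally, endpoints left without connections."""
    edges = set(graph["edges"]) - {(src, dst)}  # step 902
    nodes = dict(graph["nodes"])
    if prune:  # step 903
        still_connected = {nid for edge in edges for nid in edge}
        for nid in (src, dst):
            if nid not in still_connected:
                nodes.pop(nid, None)
    return {"nodes": nodes, "edges": edges}


graph_400 = {
    "nodes": {"group_x": "group", "role_422": "role", "role_425": "role",
              "privileges_433": "privileges"},
    "edges": {("group_x", "role_422"), ("role_422", "privileges_433"),
              ("role_425", "privileges_433")},
}

# Role 422 loses privileges 433 but keeps its other connection, so it stays.
after_first = remove_connection(graph_400, "role_422", "privileges_433")
# Role 425 loses its only connection, so its node is removed as well.
after_second = remove_connection(after_first, "role_425", "privileges_433")
print(sorted(after_second["nodes"]))  # prints ['group_x', 'role_422']
```

In the second call the node for privileges 433 is also pruned, because no remaining connection references it. Whether such a node should be kept for display is a policy choice the sketch leaves to the `prune` flag.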

In other examples, the changes may justify making a new connection. For instance, a user in group 411 may be given role 421. Graphing service 101 may therefore create a connection from the node for group 411 to the node for role 421. In some cases, graphing service 101 may create a new subgraph that includes the change (e.g., a new subgraph for the employee that was given role 421). That new subgraph may be merged into privilege graph 400 using operation 600.

FIG. 10 illustrates computing architecture 1000 for privilege graph-based representation of data access authorizations. Computing architecture 1000 is an example computing architecture for implementing graphing service 101. A similar architecture may also be used for other systems described herein, such as user terminal 103, although alternative configurations may also be used. Computing architecture 1000 comprises communication interface 1001, user interface 1002, and processing system 1003. Processing system 1003 is linked to communication interface 1001 and user interface 1002. Processing system 1003 includes processing circuitry 1005 and memory device 1006 that stores operating software 1007.

Communication interface 1001 comprises components that communicate over communication links, such as network cards, ports, RF transceivers, processing circuitry and software, or some other communication devices. Communication interface 1001 may be configured to communicate over metallic, wireless, or optical links. Communication interface 1001 may be configured to use TDM, IP, Ethernet, optical networking, wireless protocols, communication signaling, or some other communication format, including combinations thereof.

User interface 1002 comprises components that interact with a user. User interface 1002 may include a keyboard, display screen, mouse, touch pad, or some other user input/output apparatus. User interface 1002 may be omitted in some examples.

Processing circuitry 1005 comprises a microprocessor and other circuitry that retrieves and executes operating software 1007 from memory device 1006. Memory device 1006 comprises a computer readable storage medium, such as a disk drive, flash drive, data storage circuitry, or some other memory apparatus. In no examples would a storage medium of memory device 1006 be considered a propagated signal. Operating software 1007 comprises computer programs, firmware, or some other form of machine-readable processing instructions. Operating software 1007 includes access graphing module 1008. Operating software 1007 may further include an operating system, utilities, drivers, network interfaces, applications, or some other type of software. When executed by processing circuitry 1005, operating software 1007 directs processing system 1003 to operate computing architecture 1000 as described herein.

In particular, graphing module 1008 directs processing system 1003 to extract identity information for a plurality of users from a plurality of identity environments and privilege information from a plurality of data environments. Graphing module 1008 further directs processing system 1003 to form subgraphs for each of the identity environments and each of the data environments from the identity information and the privilege information. Also, graphing module 1008 directs processing system 1003 to translate the subgraphs into a canonical schema and, subsequently, combine the subgraphs into the privilege graph.

The descriptions and figures included herein depict specific implementations of the claimed invention(s). For the purpose of teaching inventive principles, some conventional aspects have been simplified or omitted. In addition, some variations from these implementations may be appreciated that fall within the scope of the invention. It may also be appreciated that the features described above can be combined in various ways to form multiple implementations. As a result, the invention is not limited to the specific implementations described above, but only by the claims and their equivalents.

What is claimed is:
 1. A method for generating a privilege graph representing data access authorizations, the method comprising: extracting identity information for a plurality of users from a plurality of identity environments and privilege information from a plurality of data environments; forming subgraphs for the identity environments and the data environments from the identity information and the privilege information; translating the subgraphs into a canonical schema; and after translating the subgraphs, combining the subgraphs into the privilege graph.
 2. The method of claim 1, comprising: displaying the privilege graph to an administrator authorized to view the privilege graph.
 3. The method of claim 1, wherein forming the subgraphs comprises: creating a user node for a user of the plurality of users and sequentially connecting the user node to one or more attribute nodes that each represent an attribute of the user indicated in the identity information.
 4. The method of claim 3, comprising: upon reaching a last attribute node of the one or more attribute nodes, connecting the last attribute node to a privileges node; and connecting the privileges node to one or more nodes of authorized data environments of the plurality of data environments that the user is authorized to access.
 5. The method of claim 4, wherein the one or more nodes of authorized data environments each represent data or a feature that the user is authorized to access.
 6. The method of claim 1, wherein translating the subgraphs comprises: for attribute nodes of the subgraphs, changing attribute labels representing attributes of a user to canonical labels defined by the canonical schema.
 7. The method of claim 1, wherein combining the subgraphs comprises: for an attribute represented by attribute nodes in multiple subgraphs, identifying a common attribute node and migrating connections with the attribute nodes to the common attribute node.
 8. The method of claim 7, comprising: identifying replicated connections with the common attribute node; and deduplicating the replicated connections.
 9. The method of claim 1, comprising: identifying a change to the privilege information; and updating the privilege graph based on the change.
 10. The method of claim 9, wherein updating the privilege graph comprises: adding or removing a connection between nodes in the privilege graph.
 11. An apparatus comprising: one or more computer readable storage media; a processing system operatively coupled with the one or more computer readable storage media; and program instructions stored on the one or more computer readable storage media that, when read and executed by the processing system, direct the processing system to: extract identity information for a plurality of users from a plurality of identity environments and privilege information from a plurality of data environments; form subgraphs for the identity environments and the data environments from the identity information and the privilege information; translate the subgraphs into a canonical schema; and after translating the subgraphs, combine the subgraphs into a privilege graph.
 12. The apparatus of claim 11, wherein the program instructions direct the processing system to: display the privilege graph to an administrator authorized to view the privilege graph.
 13. The apparatus of claim 11, wherein to form the subgraphs, the program instructions direct the processing system to: create a user node for a user of the plurality of users and sequentially connect the user node to one or more attribute nodes that each represent an attribute of the user indicated in the identity information.
 14. The apparatus of claim 13, wherein the program instructions direct the processing system to: upon reaching a last attribute node of the one or more attribute nodes, connect the last attribute node to a privileges node; and connect the privileges node to one or more nodes of authorized data environments of the plurality of data environments that the user is authorized to access.
 15. The apparatus of claim 14, wherein the one or more nodes of authorized data environments each represent data or a feature that the user is authorized to access.
 16. The apparatus of claim 11, wherein to translate the subgraphs, the program instructions direct the processing system to: for attribute nodes of the subgraphs, change attribute labels representing attributes of a user to canonical labels defined by the canonical schema.
 17. The apparatus of claim 11, wherein to combine the subgraphs, the program instructions direct the processing system to: for an attribute represented by attribute nodes in multiple subgraphs, identify a common attribute node and migrate connections with the attribute nodes to the common attribute node.
 18. The apparatus of claim 17, wherein the program instructions direct the processing system to: identify replicated connections with the common attribute node; and deduplicate the replicated connections.
 19. The apparatus of claim 11, wherein the program instructions direct the processing system to: identify a change to the privilege information; and update the privilege graph based on the change.
 20. One or more computer readable storage media having program instructions stored thereon that, when read and executed by a processing system, direct the processing system to: extract identity information for a plurality of users from a plurality of identity environments and privilege information from a plurality of data environments; form subgraphs for the identity environments and the data environments from the identity information and the privilege information; translate the subgraphs into a canonical schema; and after translating the subgraphs, combine the subgraphs into a privilege graph.